Hao Peng

Beihang University

Kwai Keye-VL Technical Report

Jul 02, 2025

Imitation Learning for Satellite Attitude Control under Unknown Perturbations

Jul 01, 2025

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following

Jun 11, 2025

STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation

Jun 11, 2025

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

May 28, 2025

AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

May 22, 2025

The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning

May 21, 2025

Unsupervised Graph Clustering with Deep Structural Entropy

May 20, 2025

mCLM: A Function-Infused and Synthesis-Friendly Modular Chemical Language Model

May 18, 2025

Reinforcement Learning Finetunes Small Subnetworks in Large Language Models

May 16, 2025